Search CORE

215 research outputs found

A Survey on Bayesian Deep Learning

Author: Wang Hao
Yeung Dit-Yan
Publication venue
Publication date: 01/07/2020
Field of study

A comprehensive artificial intelligence system needs to not only perceive the environment with different `senses' (e.g., seeing and hearing) but also infer the world's conditional (or even causal) relations and corresponding uncertainty. The past decade has seen major advances in many perception tasks such as visual object recognition and speech recognition using deep learning models. For higher-level inference, however, probabilistic graphical models with their Bayesian nature are still more powerful and flexible. In recent years, Bayesian deep learning has emerged as a unified probabilistic framework to tightly integrate deep learning and Bayesian models. In this general framework, the perception of text or images using deep learning can boost the performance of higher-level inference and in turn, the feedback from the inference process is able to enhance the perception of text or images. This survey provides a comprehensive introduction to Bayesian deep learning and reviews its recent applications on recommender systems, topic models, control, etc. Besides, we also discuss the relationship and differences between Bayesian deep learning and other related topics such as Bayesian treatment of neural networks.Comment: To appear in ACM Computing Surveys (CSUR) 202

arXiv.org e-Print Archive

DSpace@MIT

Human action recognition using local spatiotemporal discriminant embedding

Author: Dit-yan Yeung
Kui Jia
Publication venue
Publication date: 01/01/2008
Field of study

Human action video sequences can be considered as nonlinear dynamic shape manifolds in the space of image frames. In this paper, we address learning and classifying human actions on embedded low-dimensional manifolds. We propose a novel manifold embedding method, called Local Spatio-Temporal Discriminant Embedding (LSTDE). The discriminating capabilities of the proposed method are two-fold: (1) for local spatial discrimination, LSTDE projects data points (silhouette-based image frames of human action sequences) in a local neighborhood into the embedding space where data points of the same action class are close while those of different classes are far apart; (2) in such a local neighborhood, each data point has an associated short video segment, which forms a local temporal subspace on the embedded manifold. LSTDE finds an optimal embedding which maximizes the principal angles between those temporal subspaces associated with data points of different classes. Benefiting from the joint spatio-temporal discriminant embedding, our method is potentially more powerful for classifying human actions with similar space-time shapes, and is able to perform recognition on a frame-byframe or short video segment basis. Experimental results demonstrate that our method can accurately recognize human actions, and can improve the recognition performance over some representative manifold embedding methods, especially on highly confusing human action types. 1

CiteSeerX

Crossref

Semi-Supervised Discriminant Analysis Using Robust Path-Based Similarity

Author: Dit-yan Yeung
Yu Zhang
Publication venue
Publication date: 01/01/2008
Field of study

Linear Discriminant Analysis (LDA), which works by maximizing the within-class similarity and minimizing the between-class similarity simultaneously, is a popular dimensionality reduction technique in pattern recognition and machine learning. In real-world applications when labeled data are limited, LDA does not work well. Under many situations, however, it is easy to obtain unlabeled data in large quantities. In this paper, we propose a novel dimensionality reduction method, called Semi-Supervised Discriminant Analysis (SSDA), which can utilize both labeled and unlabeled data to perform dimensionality reduction in the semisupervised setting. Our method uses a robust path-based similarity measure to capture the manifold structure of the data and then uses the obtained similarity to maximize the separability between different classes. A kernel extension of the proposed method for nonlinear dimensionality reduction in the semi-supervised setting is also presented. Experiments on face recognition demonstrate the effectiveness of the proposed method. 1

CiteSeerX

Crossref